ABSTRACT

The multilevel characteristics of test item data are 
considered as a method for examining the characteristics of 
standardized norm-referenced tests. A theoretical rationale for 
examining multilevel characteristics is presented. It can be used as 
an aid to understand why program and instructional effects on 
measures constructed from individual-level psychometric data are 
weak, to improve instructional sensitivity and program relevance of
tests, and to indicate what features of a test will increase the
sensitivity of the test to instructional program variables. An
empirical example from the International Evaluation of Education
Achievement study and an analysis of one test from the Beginning
Teacher Evaluation Study are examined to better understand how
multilevel analysis can lead to more informed use of item data. It
was found that between-student analysis fails to take into account
the instructional context and its effect on student item response.
This failure has two effects. First, the relationship between item
response and other variables cannot be explained, since it is a
conglomerate of two different processes. Second, the between-student
analysis may give a distorted view of whether an effect does or does
not exist. (Author/PN)
Multilevel Properties of Test Items: An Exploratory Study
M. David Miller and Leigh Burstein 

Because of the belief that schooling affects student outcomes, the
largely negative results from school effects studies and large-scale
evaluations of the relationship of school inputs to student outcomes
have caused educational researchers to reexamine the statistical techniques
and models which have traditionally been used to arrive at these
conclusions. One methodological issue that has received much criticism has
been the use of standardized norm-referenced achievement tests as the sole
measure of educational outcomes.

Rather than abandoning norm-referenced tests, researchers may find that an
analysis of the tests reveals ways to improve them. A possible method for
examining the characteristics of standardized norm-referenced tests might
be a multilevel examination of test items. Cronbach (1976) was the first to
discuss the possible utility of multilevel item analysis:

Once the question of units is raised, all empirical test construction
and item-analysis procedures need to be reconsidered. Is it
better to retain items that correlate across classes? Or items
that correlate within classes? A correlation based on deviation 
scores within classes indicates whether students who comprehended 
one point better than most students also comprehended the second 
point better than most — instruction being held constant. A 
correlation between classes indicates whether a class that learned
one thing learned another, but this depends first and foremost
on what teachers assigned and emphasized. It is the items that
teachers give different weight to that have the greatest variance
across classes. This (differential emphasis) leads us to regard
the between-group and within-group correlations of items as conveying
different information, and makes the overall correlation for
classes pooled an uninterpretable blend. (Cronbach, 1976, pp. 9.19-9.20)

The effects that Cronbach cites need to be better understood. By
considering the multilevel characteristics of test item data, test
developers and users could potentially become better informed about test
development, analysis, usage, interpretation, and reporting. For example,
some test items may be more sensitive to background effects (e.g., prior
knowledge or socioeconomic status), while other items may be more sensitive
to instructional and program variables (e.g., time allocated per content
area, time spent on high or low success tasks). By learning what variables
an item is sensitive to, test developers will be better equipped to guard
against unknowingly selecting items which are influenced by irrelevant
characteristics of the environment in which the test is administered
(irrelevant to the purposes for which the test is developed). Perhaps
test constructors will also be able to better select items for a test
which are more sensitive to the variable of interest (e.g., amount
learned). At the least, multilevel analyses of test items will help
test developers to better describe the statistical properties of the test
and its items.

This report will be divided into four sections. In the first section,
a theoretical rationale for examining the multilevel characteristics will
be sketched. In the second section, an empirical example from the
International Evaluation of Educational Achievement study (IEA) will be
examined. Next, a preliminary analysis of one test from the Beginning
Teacher Evaluation Study (BTES) will be presented. Finally, the potential
utility of multilevel item analysis and some possible directions for
further research will be discussed.

Multilevel Analysis 

The educational system is inherently multilevel. That is, schools
are nested within districts; classes are nested within schools; and
students are nested within classrooms. Data analysis can be conducted
both between and within each of the various levels of the educational
system. Furthermore, analysis between and within different levels can
have different substantive meanings (Burstein, 1978; Burstein, Fischer,
and Miller, 1979; Cronbach, 1976). Recognizing the importance of the
choice of a unit of analysis, major evaluations, such as Follow Through
(Haney, 1974) and the National Day Care Study (Singer and Goodrich,
1979), have considered this issue in some detail. Since education can
affect students between and within all levels of the educational system,
it has been argued that evaluations of educational data should look at
more than one level of analysis for a more complete understanding of the
determinants of student achievement.

Cronbach (1976) has argued that the "majority of studies of educational
effects, whether classroom experiments, or evaluations of programs
or surveys, have collected and analyzed data in ways that conceal more
than they reveal. The established methods have generated false
conclusions in many studies" (Cronbach, 1976, p. 1). Schooling effects
studies have traditionally selected one unit of analysis, such as the
individual or the school, and have used a between-unit analysis. However,
given the intact nature of educational data, single-level analyses are
often inappropriate; the individual-level analysis can be decomposed into a
between-group analysis and a within-group analysis. It has been shown that
the correlation of two variables at the individual level is a weighted
combination of the between-group correlation and the pooled within-group
correlation (Knapp, 1977; Robinson, 1950):

$$\rho_{XY} = \eta_X \eta_Y \, \rho_{\bar{X}\bar{Y}} + \sqrt{1 - \eta_X^2} \, \sqrt{1 - \eta_Y^2} \; \rho_{(X-\bar{X})(Y-\bar{Y})}$$

where $\rho_{XY}$ is the correlation of X and Y across individuals;
$\rho_{\bar{X}\bar{Y}}$ is the correlation of X and Y for the weighted group
means; $\rho_{(X-\bar{X})(Y-\bar{Y})}$ is the correlation of the individuals'
deviations from their group means on X and Y; and $\eta_X^2$ and $\eta_Y^2$
are the proportions of variance in X and Y, respectively, that are
attributable to group differences.
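The decomposition of the individual-level correlation into a between-group part and a pooled within-group part can be checked numerically. The following is a minimal sketch in Python, not the procedure used in the studies cited: the function name `decompose_correlation` and the nine data points are hypothetical, and the data were chosen so that the between-group and within-group correlations have opposite signs. Group means are weighted by group size by letting every individual carry the mean of his or her group.

```python
import math


def _mean(values):
    return sum(values) / len(values)


def _cross(u, v):
    """Sum of cross-products of two sequences that are already centered."""
    return sum(a * b for a, b in zip(u, v))


def decompose_correlation(x, y, groups):
    """Split the individual-level correlation of x and y into between-group
    and pooled within-group parts. Assumes every variance is nonzero."""
    xs_by_group, ys_by_group = {}, {}
    for xi, yi, g in zip(x, y, groups):
        xs_by_group.setdefault(g, []).append(xi)
        ys_by_group.setdefault(g, []).append(yi)
    gmx = {g: _mean(v) for g, v in xs_by_group.items()}
    gmy = {g: _mean(v) for g, v in ys_by_group.items()}
    mx, my = _mean(x), _mean(y)

    # Total, between (group mean per individual), and within deviations.
    tx = [xi - mx for xi in x]
    ty = [yi - my for yi in y]
    bx = [gmx[g] - mx for g in groups]
    by = [gmy[g] - my for g in groups]
    wx = [xi - gmx[g] for xi, g in zip(x, groups)]
    wy = [yi - gmy[g] for yi, g in zip(y, groups)]

    eta2_x = _cross(bx, bx) / _cross(tx, tx)
    eta2_y = _cross(by, by) / _cross(ty, ty)
    r_total = _cross(tx, ty) / math.sqrt(_cross(tx, tx) * _cross(ty, ty))
    r_between = _cross(bx, by) / math.sqrt(_cross(bx, bx) * _cross(by, by))
    r_within = _cross(wx, wy) / math.sqrt(_cross(wx, wx) * _cross(wy, wy))
    # Weighted combination of the two component correlations.
    rebuilt = (math.sqrt(eta2_x * eta2_y) * r_between
               + math.sqrt((1 - eta2_x) * (1 - eta2_y)) * r_within)
    return {"r_total": r_total, "r_between": r_between, "r_within": r_within,
            "eta2_x": eta2_x, "eta2_y": eta2_y, "rebuilt": rebuilt}


# Hypothetical data: three groups of three individuals each.
x = [1, 2, 3, 4, 5, 6, 7, 8, 9]
y = [3, 2, 1, 6, 5, 4, 9, 8, 7]
groups = ["a", "a", "a", "b", "b", "b", "c", "c", "c"]
result = decompose_correlation(x, y, groups)
print(result)
```

In these made-up data the group means are perfectly positively related while the within-group deviations are perfectly negatively related, and 90 percent of the variance in each variable lies between groups, so the individual-level correlation is .9(1) + .1(-1) = .8. The single total coefficient hides two opposite processes.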

It is also true that the individual-level regression coefficient
can be decomposed into a weighted combination of a between-group coefficient
and a pooled within-group regression coefficient (Duncan, Cuzzort, and
Duncan, 1961):

$$\beta_t = \eta_X^2 \, \beta_b + (1 - \eta_X^2) \, \beta_w$$

where $\beta_t$ is calculated by regressing the individual-level dependent
measure (Y) on the individual-level independent measure (X); $\beta_b$ is
calculated by regressing the weighted group means of the dependent measure
($\bar{Y}$) on the weighted group means of the independent measure
($\bar{X}$); $\beta_w$ is calculated by regressing the dependent measure
deviations from the group means ($Y - \bar{Y}$) on the independent measure
deviations from the group means ($X - \bar{X}$); and $\eta_X^2$ is the
proportion of variance in X that is attributable to group differences. As
would be expected, when the influence of the group is weak, $\eta_X^2$
approaches zero and $\beta_t$ approaches $\beta_w$. Conversely, when the
differences on the independent measure are largely attributable to group
differences, $\eta_X^2$ approaches 1.0 and $\beta_t$ approaches $\beta_b$.
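The same kind of numerical check applies to the regression coefficient. The sketch below is illustrative only: `decompose_slope` is a hypothetical helper, the data are invented, and the weight on the between-group slope is taken to be the proportion of variance in X that lies between groups.

```python
def _mean(values):
    return sum(values) / len(values)


def decompose_slope(x, y, groups):
    """Return the total, between-group, and pooled within-group slopes of
    y on x, plus the between-group share of the variance in x.
    Assumes x varies both between and within groups."""
    xs_by_group, ys_by_group = {}, {}
    for xi, yi, g in zip(x, y, groups):
        xs_by_group.setdefault(g, []).append(xi)
        ys_by_group.setdefault(g, []).append(yi)
    gmx = {g: _mean(v) for g, v in xs_by_group.items()}
    gmy = {g: _mean(v) for g, v in ys_by_group.items()}
    mx, my = _mean(x), _mean(y)

    # Sums of squares for x and sums of cross-products for x and y.
    ss_total = sum((xi - mx) ** 2 for xi in x)
    ss_between = sum((gmx[g] - mx) ** 2 for g in groups)
    ss_within = sum((xi - gmx[g]) ** 2 for xi, g in zip(x, groups))
    sp_total = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    sp_between = sum((gmx[g] - mx) * (gmy[g] - my) for g in groups)
    sp_within = sum((xi - gmx[g]) * (yi - gmy[g])
                    for xi, yi, g in zip(x, y, groups))

    eta2_x = ss_between / ss_total
    b_total = sp_total / ss_total
    b_between = sp_between / ss_between
    b_within = sp_within / ss_within
    # Weighted combination of the between and within slopes.
    rebuilt = eta2_x * b_between + (1 - eta2_x) * b_within
    return {"b_total": b_total, "b_between": b_between,
            "b_within": b_within, "eta2_x": eta2_x, "rebuilt": rebuilt}


# Hypothetical data: three groups of three individuals each.
x = [1, 2, 3, 4, 5, 6, 7, 8, 9]
y = [3, 2, 1, 6, 5, 4, 9, 8, 7]
groups = ["a", "a", "a", "b", "b", "b", "c", "c", "c"]
slopes = decompose_slope(x, y, groups)
print(slopes)
```

Here the between-group slope is 1, the within-group slope is -1, and 90 percent of the variance in X is between groups, so the total slope is .9(1) + .1(-1) = .8, close to the between-group value, as the limiting argument in the text would suggest.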



Often the decomposition of the student-level analysis in educational
research has been ignored. This failure to take into account the multilevel
properties of the data has often caused educational researchers to arrive
at misleading conclusions about the effects of various determinants of
educational achievement (Burstein, 1978; Burstein, Linn, and Capell,
1978; Burstein and Miller, 1978; Cronbach, 1976; Cronbach and Webb, 1975).
It is possible that the examination of data from a multilevel perspective,
which has too often been absent in other aspects of school effects studies
and program evaluations, might also help us to better understand why
program and instructional effects on measures constructed from
individual-level psychometric data are weak. Perhaps a multilevel
perspective applied to test development and interpretation will help to
improve the instructional sensitivity and program relevance of tests. It is
possible that the multilevel characteristics of item data will show what
features of a test will increase the sensitivity of the test to
instructional and program variables.

Item Analysis

In order to better understand the effects mentioned by Cronbach (1976)
and what might be gained from a multilevel analysis of item data, it is
important to be aware of classroom and background processes and how they
affect differences in between-class and within-class achievement. Cronbach
(1976) suggested that items that correlate highly across classes should
be indicative of instructional and program effects. If some teachers
emphasize a given content area, such as fractions, and others do not,
one would expect high correlations across classes of items from a test
measuring that given content area. On the other hand, if items correlate
positively within classes, it indicates that students who do well on an
item relative to the other members of the class will also do well on other
items. This effect might be due to differences in students along such
dimensions as ability or motivation.

The variance of an item can also be partitioned into two independent
components: between-class variation and within-class variation. The
between-class variation of an item can be indicative of instructional
and program variables. If teachers spend different amounts of time in
a specific content area or they differ in their enthusiasm for that
content area, there could be a net effect on the class which could increase
the between-class variance of an item from that content area. Similarly,
the within-class variance could be affected by instructional and program
variables, but with a different substantive interpretation. While the
between-class variability can be influenced by the net effect of classroom
and instructional variables averaged across students, the within-class
variability could represent differential sensitivity of students within a
classroom to instructional and program variables. For example, students who
are active participants in their learning might learn more from a given
program than students who are passive learners. Additionally, within-class
variability might represent differences in an instructional or
program variable within the class. For example, teachers may spend more
time with some students than others, or time on task may vary within the
classroom.
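The partition of an item's variance into between-class and within-class components can be sketched as follows. This is an illustration under stated assumptions rather than the computation used in either study: the helper name `partition_item_variance` is hypothetical, items are assumed to be scored 0 (wrong) or 1 (right), and the eight responses are invented.

```python
def partition_item_variance(scores, classes):
    """Split the total sum of squares of an item score into a between-class
    part and a within-class part; also return the between-class share."""
    by_class = {}
    for score, c in zip(scores, classes):
        by_class.setdefault(c, []).append(score)
    grand_mean = sum(scores) / len(scores)
    class_mean = {c: sum(v) / len(v) for c, v in by_class.items()}

    # Between: class mean versus grand mean, counted once per student.
    ss_between = sum((class_mean[c] - grand_mean) ** 2 for c in classes)
    # Within: each student's score versus his or her own class mean.
    ss_within = sum((s - class_mean[c]) ** 2
                    for s, c in zip(scores, classes))
    ss_total = sum((s - grand_mean) ** 2 for s in scores)
    eta2 = ss_between / ss_total if ss_total > 0 else float("nan")
    return {"ss_between": ss_between, "ss_within": ss_within,
            "ss_total": ss_total, "eta2": eta2}


# Hypothetical responses to one item: four students in each of two classes.
scores = [1, 1, 1, 0, 0, 0, 1, 0]
classes = ["A", "A", "A", "A", "B", "B", "B", "B"]
parts = partition_item_variance(scores, classes)
print(parts)
```

With three of four students correct in class A and one of four in class B, one quarter of the item variance lies between classes and three quarters within classes. The two sums of squares add up to the total, which is what makes the components independent pieces of information.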

Finally, the between-class and within-class components may also
be affected by background variables. The between-class component may be due
to differing community characteristics (e.g., socioeconomic status), such as
the effects of differences across classes in the abilities and backgrounds
of students. The within-class component may reflect the differing abilities
of students within the class, differences in learning rates, or differences
in the students' reactions to different instructional methods.

Empirical Examples

In order to better understand how multilevel analysis can lead to
more informed use of item data, data from two sources will be examined.
The first example involves a biology subtest from the International
Association for the Evaluation of Educational Achievement (IEA) Six
Subject Survey. These data were analyzed previously by McLarty (1979).
The second example is drawn from the Beginning Teacher Evaluation Study
(BTES).



IEA Biology Items

IEA collected data from 21 countries across six subject areas.
Science was considered because it was a subject that was potentially less
influenced by sources outside of the school environment. Information on
the development of the science test items is available in Comber and
Keeves (1973). Data on the results of the science test in the United
States are also available in Wolf (1977).

In order to narrow the focus of the analysis, the data from Form B
of the Biology subtest for Population II (14 year olds) in the United
States were examined. The nine test items (numbered 2 through 10, as on
the test) are contained in Appendix A. For data management and economic
reasons, McLarty (1979) selected a random sample of schools (schools with
fewer than 20 cases were eliminated first). The sample actually used for
multilevel item analysis included 1210 students in 50 schools.

The descriptive statistics for the items are contained in Table 1.
Since this test was developed using traditional psychometric techniques,
the individual differences are maximized rather than the school differences
(i.e., SD(within) > SD(between)). This has the likely effect of yielding
small $\eta^2$. Note that the proportion of variation accounted for by the
school ranges from 6 percent on item 10 to only 10 percent on item 8.

The item intercorrelations are contained in Table 2. As McLarty
(1979) points out, the low within-school correlations are probably due
to the nature of the construct being measured. Biology covers a wide
range of subjects including botany, zoology, and chemical processes
involved in the life cycle (e.g., photosynthesis). Thus, it is conceivable
that a student might learn the material necessary for one item and not
another, depending on what area of biology the student is interested in,
so that knowing one biology item is not necessarily related to knowing
another. What relationship there is between items seems to be due to
between-school differences. The school average on one item seems to be
highly related to the school average on another item. These results are
also confirmed in the point biserial correlations of Table 3.

In Tables 4 through 9, item responses were regressed on six
variables. The following independent measures were used:

1. Student Sex - 1=male and 2=female;

2. Raw Word Knowledge - score on a 40-item vocabulary test;

3. Liking of Biology - five-point ascending scale of a student's
rating of each school subject;

4. Books in the Home - 1=none, 2=1-10, 3=11-25, 4=26-50, and 5=51
or more;

5. Hours of Biology Instruction per Week - 1=do not take, 2=less than
1 hour, 3=less than 3 hours, 4=less than 5 hours, and 5=more
than 5 hours; and

6. Hours of Biology Homework per Week - 1=none, 2=less than 1 hour,
3=less than 3 hours, 4=less than 5 hours, and 5=more than 5 hours.

Each table contains three rows of regression coefficients corresponding
to two regression equations:

$$Y = a + b_t X$$

$$Y = a + b_1 \bar{X} + b_2 X$$

The first row of regression coefficients (total) is derived from the first
equation - the regression of student item scores (Y) on the student
variables mentioned above (X). The remaining two rows come from the
second equation - the regression of student item scores (Y) on the school
means for the variable ($\bar{X}$) and the student-level measure (X). The
two coefficients $b_1$ and $b_2$ are interpreted as the between-school
effect after controlling for the individual-level measure and the
within-school effect, respectively (see Alwin, 1976; Burstein, 1978;
Firebaugh, 1978 for evidence on the interpretation).
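The second equation, in which an item score is regressed on both the school mean of a predictor and the student's own value, can be fit by ordinary least squares. The sketch below is a hypothetical illustration, not the software used for Tables 4 through 9: the function name `fit_two_level`, the closed-form two-predictor solution, and the data are all assumptions made for the example.

```python
def _mean(values):
    return sum(values) / len(values)


def fit_two_level(x, y, groups):
    """Least-squares fit of y = a + b1 * (group mean of x) + b2 * x.
    Returns (a, b1, b2). Requires x to vary both between and within groups."""
    xs_by_group = {}
    for xi, g in zip(x, groups):
        xs_by_group.setdefault(g, []).append(xi)
    group_mean = {g: _mean(v) for g, v in xs_by_group.items()}
    xbar = [group_mean[g] for g in groups]  # school mean carried by each student

    mx, mxbar, my = _mean(x), _mean(xbar), _mean(y)
    cx = [v - mx for v in x]
    cxbar = [v - mxbar for v in xbar]
    cy = [v - my for v in y]

    # Normal equations for two centered predictors.
    s11 = sum(v * v for v in cxbar)
    s22 = sum(v * v for v in cx)
    s12 = sum(u * v for u, v in zip(cxbar, cx))
    s1y = sum(u * v for u, v in zip(cxbar, cy))
    s2y = sum(u * v for u, v in zip(cx, cy))
    det = s11 * s22 - s12 * s12
    if det == 0:
        raise ValueError("x must vary both between and within groups")
    b1 = (s1y * s22 - s12 * s2y) / det
    b2 = (s11 * s2y - s12 * s1y) / det
    a = my - b1 * mxbar - b2 * mx
    return a, b1, b2


# Hypothetical data: three schools with three students each.
x = [1, 2, 3, 4, 5, 6, 7, 8, 9]
y = [3, 2, 1, 6, 5, 4, 9, 8, 7]
groups = ["a", "a", "a", "b", "b", "b", "c", "c", "c"]
a, b1, b2 = fit_two_level(x, y, groups)
print(a, b1, b2)
```

In this parameterization the coefficient on the student-level measure equals the pooled within-group slope, and the coefficient on the school mean equals the between-group slope minus the within-group slope. For the invented data the between slope is 1 and the within slope is -1, so the fit gives b2 = -1 and b1 = 2.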

The implications from Table 4 are fairly straightforward. For items
3, 4, 8, and 9, males will score higher than females in the same school.
For item 5, the opposite effect was found (within the same school, females
will score higher than males). Furthermore, for items 2, 4, 7, and 8,
the between-school coefficients suggest that schools with a higher ratio
of males to females will perform higher than schools with a lower ratio.
These coefficients may represent sex bias at different levels. The role of
scientist has traditionally been viewed as a male role. Possibly, this
expectancy of different roles for males and females can be seen through
differences in instruction: classes with more males may receive more science
instruction and encouragement. In addition, within the classroom, males may
receive more help and encouragement from the teacher than their female
classmates.

Much of the information from Table 4 is lost through examination of
the between-student (Total) coefficients. First, when there was only a
between-school difference, the between-student coefficients were not
sensitive enough to find any differences (items 2 and 7). Secondly, in
one case (item 10), the between-student analysis found an effect from the
combination of two nonsignificant effects (between-school and
within-school). The interpretation of this and any other significant
between-student coefficient is not as straightforward as the interpretation
of the multilevel coefficients. As Cronbach (1976) points out, the
between-student analysis is often an uninterpretable blend of the
between-school and within-school analyses.

The raw word knowledge test used in Table 5 can be interpreted as
a measure of verbal ability. The positive coefficients in the table show
two things. For items 3 through 10, the within-school coefficients show
that students who are higher in verbal ability than their schoolmates
are more likely to answer the item correctly. In addition, for items
3, 4, 6, 7, and 9, schools with a higher mean verbal ability did better
on the item than schools with a lower mean verbal ability. This suggests
that the test may require a high level of verbal ability. An inspection
of the items shows that they do require a fairly high level of reading
proficiency. The largely verbal format of the test may require as much
verbal ability as biology. However, it is also possible that students
who excel in one academic area (e.g., verbal ability) also excel in other
areas (e.g., biology).

Tables 6 and 7 can be interpreted in a similar manner. In Table
6, liking of biology is an attitude indicator. In Table 7, number of
books in the home can be seen as an indicator of socio-economic status.
The following results were drawn from Tables 6 and 7. Schools with higher
mean attitude toward biology did better on items 3 and 5. However, most
of the items were more sensitive to within-school attitudes (items 3,
4, 6-10). That is, students that liked biology more than their peers
were also more likely to respond correctly to the items. Finally, schools
with a higher average socio-economic status did better on most items (3-10)
and students with a higher socio-economic status than their peers did
better on items 3, 4, and 6.

The direction of the regression coefficients is consistent with
prior findings about the relationship of socio-economic status, liking
of subject matter, and verbal ability to student achievement. That is,
schools containing students with a more positive attitude toward the
subject matter, a higher mean socio-economic status, or a higher mean
verbal ability were more likely to exhibit higher achievement. In addition,
students who were higher than their peers on the three variables were
more likely to achieve higher than their peers. However, items are
differentially sensitive to different variables. For example, item 2
is only sensitive to between-school sex differences, whereas item 4 is
sensitive to within-school differences on all four variables and
between-school differences on three of the four variables. Also, examination
of the between-student coefficients will not reveal the various processes.
For example, on item 7, the total coefficient on liking of biology, books
in the home, and raw word knowledge represents within-school differences,
between-school differences, and the combination of between-school and
within-school differences, respectively.

Finally, in Tables 8 and 9, two school variables are used to predict
item response: hours of instruction and hours of homework. As can be
seen from items 3, 4, 6, 7, 8, and 10, the more instruction a student
receives relative to his/her peers, the higher the student will achieve
relative to his/her schoolmates. The amount of homework also had a
positive effect both between-school and within-school. Item 3 shows that
the more biology homework that is done across the school, the higher the
school mean will be for this item. For items 4, 6, 7, 8, and 10, more
homework by the student results in higher achievement than that of
schoolmates with less biology homework. Apparently, the amount of
instruction and homework do affect student achievement within the school.

BTES 

The Beginning Teacher Evaluation Study was sponsored by the
California Commission for Teacher Preparation and Licensing with funds
from the National Institute of Education. The study was conducted to
examine the relationship between instructional variables and achievement
in reading and mathematics in grades 2 and 5. Of particular interest
to this paper was the learning of fifth grade mathematics, a subject
area in which a great deal of time and effort are put into teaching
fractions. Tests were administered to six students in each of 25 second
and 25 fifth grade classes on four occasions: (A) October, 1976;
(B) December, 1976; (C) May, 1977; and (D) September, 1977. In addition
to the achievement tests, measures of allocated time, engagement,
and success rates were obtained. Students were selected for not being
extremely low or extremely high in ability (roughly 30 to 70 percentile).
This restriction in range of entering student ability, combined with the
care taken to measure instructional variables and the development of
instructionally sensitive tests, makes this data set an interesting
example for examining the relationship between the multilevel
characteristics of items and instructional and program variables.

While the IEA data did have some instructional and school process
variables, the BTES is especially noteworthy for its efforts to develop
instructionally sensitive instruments (BTES, Filby and Dishaw, 1975, 1976).
Since the goal of BTES was to understand the relationship between
instructional variables and student achievement, special efforts were made to
develop tests which would be reactive to instruction. The researchers
felt that tests used to evaluate instructional processes must be sensitive
indicators of classroom learning. Test items were checked for content
validity to be sure that test content and instructional content overlapped.
Then, items were checked to see whether gains were related to instruction
(Carver, 1974). In their analysis, Filby and Dishaw assumed that students
would perform better on an item after instruction than prior to instruction.
In addition, students who receive high amounts of instruction in a given
content area were expected to perform better on items from that content
area than students who receive less instruction in that content area.
Items that conformed to the two above assumptions were then selected to
form a reactive, sensitive measure of classroom learning. Using this
technique for test development (i.e., item selection), the BTES tests
did show a significant relationship to time allocation by content area
(Fischer, Filby, Marliave, Cahen, Dishaw, Moore, and Berliner, 1978).

In order to focus our attention on a manageable data set, it was
decided to work only with the fraction items of the mathematics grade 5
test. This further reduced the data set since the fraction items were
not given on occasion A (October, 1976). The fifteen items from the
fractions subtest tested the student's ability to identify equivalent
fractions. The skills tested included reducing fractions and finding
the missing numerator or denominator in a fractional equation. The items
are contained in Appendix B. There were 127 cases on occasion B (December,
1976), 123 cases on occasion C (May, 1977), and 89 cases on occasion
D (September, 1977). The individual students were drawn from 21 classrooms.

Besides having the instructional variables, another difference
between the BTES analyses and the IEA analyses was the use of a "pretest".
The model for the BTES analysis was the same except that two independent
variables were used. The dependent variables were the item responses on
occasion C. The independent variables were the item responses on occasion
B along with:

1. Allocated Time - minutes allocated to learning fractions divided
by 1000,

2. Easy Time - estimated time spent doing work that is easy for the
student, divided by 100,

3. Hard Time - estimated time spent doing work that is difficult
for the student.

The regression equations are the same as those used in the IEA analyses,
except that there is now a pair of independent variables in each equation.

The basic multilevel item characteristics are given in Tables 10a,
10b, and 10c. Two features of the tables are especially prominent. First,
students scored appreciably higher on occasion C than on occasion B and
slightly lower on occasion D than on occasion C. As was expected,
performance increased after instruction and fell off over the summer
vacation. The second feature of these tables is that the average $\eta^2$
followed the same pattern as the mean response. Apparently, the same
students working together within a classroom and getting roughly the same
level of instruction within a classroom result in large between-class
effects, but after the class breaks up, the between-class effect begins to
diminish. The pattern of summer loss is unrelated to class membership.

In Table 11 the point biserial correlations are given. The majority
of the items correlated fairly highly with the subtest at all levels
of analysis. This meant that students who did well on an item also did
well on the rest of the test, relative to the rest of the class. Also,
a class that scored high on the subtest was likely to get the individual
items right. Hence, it appears that the test is fairly reliable for
measuring either within-class or between-class differences.

The regression analyses are contained in Tables 12, 13, and 14.
Each table is based on the prediction of item scores from the same item
on an earlier occasion and an instructional variable. The "pretest" seems
to have a positive impact on both the between-class and within-class
analyses in all three tables. The positive within-class effect shows that
students who do better than their classmates on occasion B will do better
than their peers on occasion C. The positive between-class effect shows
that classes that do well on the item on one occasion will also do well on
the item on the second occasion.

benefited more from instruction. . 


Instructional effects were also found to be related to item response.
Table 12 shows that for item 9, there was a significant positive
relationship between average classroom allocated time and item response.
Classes which spent more time learning fractions got better results on this
item. None of the within-class coefficients were significant. We interpret
this along with an $\eta^2$ of .720 for allocated time to mean that there is
not a great deal of variation in the allocated time of different students
within a class. Students within a class will often work on a given content
area at the same time. However, individualization and learning centers
can differentiate the time allocated to different students within a class.

In the case of items 1, 2, 4, 5, 8-10, and 12-15, there was a confounding
of effects. While neither the between-class nor the within-class
coefficients are significant, the total coefficient is significant. This is
a case where multilevel item analysis would have suggested a different
conclusion than a total analysis. Apparently, the combination of the
between-class effect and the within-class effect does suggest that
students who are allocated more time will perform higher on the item, but
the partitioning of the variance masks the effect. This suggests that the
between-student analysis can also give useful information. While
partitioning effects into between-class effects and within-class effects
often may help to better understand the classroom effects, the
between-student effects may also yield useful and interesting information.

Hard time and easy time are peculiar variables in that they have
different substantive meaning at the two levels of analysis (i.e.,
between-class and within-class). At the between-class level, the variables
can be interpreted as measures of what level the class is taught at.
However, within-class effects can be attributed largely to ability. A
student with high ability will spend a great deal of time going
through exercises which are easy simply because he/she knows more about the
tasks assigned to the whole class. In contrast, a low ability student
will find very little to be easy.

Tables 11 and 12 are consistent with our interpretation of the
variables easy time and hard time. At the between-class level, too much
easy time has a negative effect for items 1 and 2, and too much hard time
has a positive effect for item 11. Apparently, too much easy time for
the class is detrimental to learning; whereas, more hard time may be
beneficial to the students. When a classroom is taught below its level,
the material covered is already known and no learning occurs. However,
when a classroom is taught at or above its level, the class excels
because of the challenge. The within-class results were also consistent
with the above discussion. A positive effect within-class for items 1
and 2 on easy time suggests that students who had more time spent on
easy activities were the higher achievers. Conversely, a negative effect
for^ems^^and 11 on hard time suggests that students who experienced . 
more time, on diffTcuTt^activities were low achievers. 

The BTES analyses suggest that there is much to be learned about
the relationship of instructional variables and item response from a
multilevel perspective. Effects can occur both between and within classes.
Furthermore, some possible different substantive meanings were given to
between-class and within-class effects.

Possible Utility of Multilevel Item Analysis 

Major concerns about standardized norm-referenced tests have centered
around their program relevance and instructional sensitivity. These
concerns are generated by the weak evidence of program and instructional
effects (Coleman, Campbell, Hobson, McPartland, Mood, Weinfeld, and York,
1966; Averch, Carroll, Donaldson, Kiesling, and Pincus, 1972) in
conjunction with findings that test performance is higher when there is a
substantial overlap between test content and instructional content
(Armbruster, Stevens, and Rosenshine, 1977; Jenkins and Pany, 1976; Madaus,
Kellaghan, Rakow, and King, 1979; Walker and Schaffarzick, 1974) and
that even the most broadly based achievement tests vary substantially in
content coverage (Porter, Schmidt, Floden, and Freeman, 1978).

Clearly, more effort is needed to develop instructionally sensitive
measures. Efforts to develop instructionally sensitive and program
relevant tests have followed two lines. First, there has been an effort to
develop, by curricula and test analysis, tests such that program content
and instructional content overlap with test content. Second, as in
the BTES, investigators have attempted to develop instructionally
sensitive tests using logical empirical methods (e.g., as discussed on
page 13). However, whether either of these test development
strategies would have a large impact on the quality of testing in schools
is unclear. The majority of testing currently being conducted involves
either standardized norm-referenced tests or state assessment and competency
testing. Typically, the local school district has little input to the
test development process and must rely on the publisher's and state
educational agency (SEA) generated results.

While a large-scale development effort may not be possible, there
does seem to be some virtue in developing test analysis strategies that
district personnel can use to "customize" the standardized test and
assessment data to their local needs. Such strategies should be within the
technical and economic means of district research and evaluation staff.

One way of attacking the problem is to develop methods to improve
instructional sensitivity that test publishers and SEA testing agencies
would willingly employ in their test development activities. Such methods
would have to both command the respect of the applied psychometric community
and be viewed as economically and politically advantageous.

One possible step in the right direction in the development of
instructionally sensitive tests and test use may be found from an
examination of the multilevel characteristics of test item data. As has
been seen, different items are sensitive to different background and class
processes. Possibly, through the use of multilevel analyses of item data,
subtests can be formed which are more sensitive to the between-class or
within-class process of interest, or at least, items could be excluded
from the test results which are insensitive to the variables of interest.
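One way such a subtest might be formed is sketched below. It assumes that an item counts as sensitive at a given level when its coefficient exceeds twice its standard error, the criterion used to star coefficients in the regression tables. The function name and every number are hypothetical and serve only to show the mechanics.

```python
def sensitive_items(estimates, level):
    """Return the items whose coefficient at `level` exceeds twice its standard error."""
    keep = []
    for item, by_level in sorted(estimates.items()):
        coef, se = by_level[level]
        if abs(coef) > 2 * se:
            keep.append(item)
    return keep


# Hypothetical (coefficient, standard error) pairs, for illustration only.
estimates = {
    1: {"between": (0.16, 0.05), "within": (0.02, 0.03)},
    2: {"between": (0.03, 0.06), "within": (0.09, 0.03)},
    3: {"between": (0.14, 0.06), "within": (0.08, 0.03)},
    4: {"between": (0.01, 0.05), "within": (0.01, 0.03)},
}

between_subtest = sensitive_items(estimates, "between")
within_subtest = sensitive_items(estimates, "within")
print("between-class subtest:", between_subtest)
print("within-class subtest:", within_subtest)
```

Item 4 in this made-up set is sensitive at neither level and would be a candidate for exclusion from results intended to reflect the variable of interest.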

Conclusions

Items can be sensitive to background and instructional variables.
They can be sensitive within groups, between groups, or both. That
is, classrooms can have an effect on the student's response to an item.
In addition, the relative rank of a student with respect to an instructional
or a background variable can affect the item response.

Explaining multilevel effects on item response is still at a
rudimentary stage and needs to be explored further. What is clear is that a
between-student analysis fails to take into account the instructional
context and its effect on student item response. This failure has two
effects. First, the relationship between item response and other variables
cannot be explained since it is a conglomerate of two different processes.
That is, the between-class effect and within-class effect may have different
substantive meanings which cannot be sorted out in a between-student
analysis. Second, the between-student analysis may give a distorted
view of whether an effect does or does not exist. That is, the combination
of the between-class effect and within-class effect can work in opposite
directions to obscure an effect that does exist, or they can work in the
same way to produce a statistically significant effect when neither source
is statistically significant by itself.
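The second effect can be made concrete for the simplest case of a single predictor. The identity below is the standard decomposition of a pooled regression slope. It is offered as a sketch rather than as the model fitted in this study, since the BTES regressions also include a pretest and would require matrix-valued weights. The symbols are notation introduced here, and the numbers in the example are hypothetical.

```latex
% Single-predictor case: x = explanatory variable, y = item response.
% b_T, b_B, b_W: between-student (total), between-class, and pooled
% within-class slopes of y on x.
% \eta_x^2: proportion of the variation in x that lies between classes.
\[
  b_T = \eta_x^2 \, b_B + \left(1 - \eta_x^2\right) b_W ,
  \qquad
  \eta_x^2 = \frac{SS_B(x)}{SS_T(x)} .
\]
% Hypothetical example in which two real effects cancel:
\[
  \eta_x^2 = .25,\quad b_B = -.30,\quad b_W = .10
  \;\Longrightarrow\;
  b_T = (.25)(-.30) + (.75)(.10) = 0 .
\]
```

With weights and slopes of the same sign the two terms add instead, which is how two individually weak effects can combine into a detectable between-student effect.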
Clearly, more work is needed to better understand the multilevel
characteristics of items. One possible avenue which may prove fruitful
is the expansion of the present model. Items may relate to variables in
more complex ways. A model might be built that takes into account
socioeconomic status, verbal ability, a "pretest", and instructional
variables simultaneously. Another approach might be to examine a variety
of indices of grouping effects for their applicability to test item data.
Finally, the properties of subtests which might be formed using
multilevel item analysis should be examined.
Table 1:
Item Descriptive Data from IEA
(N=1210)

                       Standard Deviation
Item     Mean      Total    Between   Within      η²
  2       --        .4?       .12       --        --
  3       --        .48       .15       .45       .10
  4       --        .46       .13       .44       .08
  5       .1?       .39       .11       .33       .0?
  6       --        .47       --        --        .07
  7       .49       .48       .15       .46       .10
  8       --        .43       .14       .41       .10
  9       .32       --        .14       .44       .09
 10       .34       .45       .11       .43       .06
Total    4.00      1.82       .76      1.65       .18

SOURCE: McLarty, 1979.
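The last column of Table 1 can be checked against the standard deviations, assuming it reports η², the proportion of an item's total variance that lies between schools. The sketch below uses three rows whose entries are legible in Table 1. The function name is hypothetical, and small discrepancies are expected because the published standard deviations are rounded.

```python
def eta_squared(sd_between, sd_total):
    """Share of total variance lying between groups, from reported standard deviations."""
    return sd_between ** 2 / sd_total ** 2


# item: (total SD, between SD, reported eta squared), taken from Table 1
table1_rows = {
    3: (0.48, 0.15, 0.10),
    4: (0.46, 0.13, 0.08),
    10: (0.45, 0.11, 0.06),
}

for item, (sd_total, sd_between, reported) in table1_rows.items():
    computed = eta_squared(sd_between, sd_total)
    print(f"item {item:2d}: computed {computed:.3f}, reported {reported:.2f}")
```

For these rows the computed values agree with the reported ones to two decimal places, which supports reading the column as η².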



Table 2:
IEA Item Intercorrelations
(N=1210 students, 50 schools)





2 


3 




5 


6 


7 


* 

> 

2 - ■" ■ 


(-total) 

(betv/een) 

(v/ithin) 




• 






3 


-.02 

.03 
-.03 








• 


• 


l^ 


'.05 
.19 


.16 
.66 

Ml 








'i 


'5 


.26 

.Oif 


.06 

.26 
^ .Oi^ 


.02 

.19 
.01 








6 


• .Oif 
.Oi^ 


.15 
.53 
.12 


.11 
.09 


* .Oo 

.40 
.05 






7 


.Oif 
-.01 

.05 


.16 
.58 

.12 


• .20 

.52 
.17 


.03 
.29 
.00 


.53 
.23 




8 


.10 

-.36 
.07 


.l^^ 
.33 
.12 


.40 
.08 


.08 
.19 
.07 


.15 
.25 
.15 


.14 

.30 
.11 


9 


.10 


.12 
.08 


.10 

M 
.07 


.07 
.29 
.04 


.07 
• .04 


-.11 

.41 
.08 


10 


.05 
.15 
.05 


.Oi* 
.40 


.03- 
.32 

^.01 


.06 
.52 
.03 


.09 
.08 


.08 

.28 
.06 



SOURCE: McLarty, 1979. 



Table 3. IEA corrected item-total correlation

Item     Total    Between   Within
  2       .12       .52       .10
  3       .23       .43       .17
  4       .22       .45       .17
  5       .12       .49       .0?
  6       .27       .47       .24
  7       .29       .44       .24
  8       .28       .47       .25
  9       .22       .45       .17
 10       .13       .50       .10
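A corrected item-total correlation can be computed as sketched below, assuming "corrected" has its usual meaning: the item is removed from the total score before the two are correlated. The sketch covers only the ordinary between-student (total) correlation; the between and within columns of Table 3 would apply the same idea to group means and to deviations from group means. Function names and data are hypothetical.

```python
import math


def pearson(xs, ys):
    """Pearson product-moment correlation of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def item_total(scores, item, corrected=True):
    """scores: one list of 0/1 item scores per student; item: column index."""
    item_scores = [row[item] for row in scores]
    totals = [sum(row) for row in scores]
    if corrected:
        # Remove the item's own contribution from the total score.
        totals = [t - s for t, s in zip(totals, item_scores)]
    return pearson(item_scores, totals)


# Hypothetical responses of six students to four items.
scores = [
    [1, 1, 1, 0],
    [1, 1, 0, 0],
    [1, 0, 1, 1],
    [0, 0, 0, 0],
    [0, 1, 0, 1],
    [0, 0, 0, 0],
]
corrected = item_total(scores, 0, corrected=True)
uncorrected = item_total(scores, 0, corrected=False)
print(f"corrected {corrected:.3f}, uncorrected {uncorrected:.3f}")
```

The uncorrected value is inflated because the item is correlated with itself as part of the total, which is why the corrected form is the one reported.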
Table 4. Regression of IEA biology items on sex.

                         EFFECT ESTIMATES

               Unstandardized                  Standardized
Item     Between    Within     Total     Between   Within    Total
  2      -.1?6*     -.025      -.045      -.06      -.03      -.05
  3      -.068      -.083*     -.090*     -.02      -.09      -.09
  4      -.219*     -.074*     -.197*     -.08      -.08      -.11
  5      -.006       .070*      .070*     -.00       .09       .09
  6      -.047      -.035      -.040      -.02      -.04      -.04
  7      -.198*     -.006      -.027      -.07      -.01      -.03
  8      -.223*     -.104*     -.127*     -.08      -.12      -.15
  9      -.071      -.070*     -.078*     -.02      -.08      -.08
 10       .007       .053       .054*      .00       .06       .06

* Coefficient exceeds twice its standard error.
Table 5. Regression of IEA biology items on [illegible] Knowledge.

                         EFFECT ESTIMATES

               Unstandardized                  Standardized
Item     Between    Within     Total     Between   Within    Total
  2       .009       .003       .005       .05       .03       .05
  3       .018*      .018*      .022*      .09       .19       .24
  4       .013*      .018*      .021*      .07       .22       .25
  5       .008       .007*      .009*      .05       .09       .12
  6       .014*      .012*      .015*      .07       .14       .17
  7       .018*      .017*      .021*      .09       .19       .23
  8       .005       .018*      .019*      .03       .22       .24
  9       .022*      .010*      .014*      .12       .11       .17
 10       .003       .007*      .008*      .02       .09       .09

* Coefficient exceeds twice its standard error.



Table 6. Regression of IEA biology items on students' liking of biology.

                         EFFECT ESTIMATES

               Unstandardized                  Standardized
Item     Between    Within     Total     Between   Within    Total
  2      -.025       .024       .021      -.01       .04       .03
  3       .163*      .066*      .085*      .09       .10       .13
  4       .099       .045*      .056*      .05       .07       .09
  5       .133*     -.010       .006       .08      -.02       .01
  6      -.039       .084*      .079*     -.02       .13       .12
  7       .039       .085*      .090*      .02       .13       .14
  8      -.012       .108*      .106*     -.01       .18       .18
  9       .044       .072*      .077*      .02       .11       .12
 10      -.013       .076*      .077*      .01       .12       .13

* Coefficient exceeds twice its standard error.
Table 7. Regression of IEA biology items on number of books in the
home.

                         EFFECT ESTIMATES

               Unstandardized                  Standardized
Item     Between    Within     Total     Between   Within    Total
  2      -.011       .006       .005      -.01       .01       .01
  3       .193*      .051*      .073*      .11       .09       .13
  4       .190*      .063*      .084*      .12       .11       .15
  5       .097*      .026       .037*      .0?       .06       .08
  6       .118*      .067*      .080*      .07       .11       .14
  7       .247*      .020       .048*      .14       .03       .08
  8       .162*      .027       .045*      .11       .05       .09
  9       .137*      .021       .037*      .08       .04       .07
 10       .132*      .011       .026       .08       .02       .05

* Coefficient exceeds twice its standard error.



Table 8. Regression of IEA biology items on biology instruction.

                         EFFECT ESTIMATES

               Unstandardized                  Standardized
Item     Between    Within     Total     Between   Within    Total
  2      -.086*      .019      -.002      -.08       .03      -.00
  3       .032       .063*      .071*      .03       --        .12
  4      -.038       .038*      .029      -.04       .07       .05
  5       .036       .003       .012       .04       .01       .02
  6      -.0?6       .059*      .047*     -.04       .10       .08
  7      -.014       .047*      .044*     -.01       .08       .08
  8       .006       .043*      .044*      .01       .08       .09
  9       .011       .012       .014       .01       .02       .03
 10      -.023       .047*      .041*     -.02       .09       .08

* Coefficient exceeds twice its standard error.



Table 9. Regression of IEA biology items on biology homework.

                         EFFECT ESTIMATES

               Unstandardized                  Standardized
Item     Between    Within     Total     Between   Within    Total
  2      -.066       .024       .009      -.05       .04       .02
  3       .111*      .027       .051*      .08       .04       .08
  4      -.007       .040*      .039*     -.01       .07       .06
  5       .059      -.008       .005       .05      -.02       .01
  6      -.036       .057*      .049*     -.03       .09       .08
  7      -.004       .056*      .055*     -.00       .09       .09
  8       .023       .038*      .043*      .02       .07       .07
  9       .023       .032       .037*      .02       .05       .06
 10      -.022       .057*      .052*     -.02       .10       .09
Table 10a. Descriptive statistics of BTES fractions subtest - occasion B.

                       Standard Deviation
Item     Mean      Total    Between   Within      η²
  1       .58       .50       .30       .39       .37
  2       .54       .50       .30       .40       .36
  3       .58       .50       .23       .44       .21
  4       .54       .50       .26       .42       .28
  5       .16       .37       .18       .32       .25
  6       .50       .50       --        .4?       --
  7       .42       .50       .31       .31       .39
  8       .13       --        .14       --        .1?
  9       .09       .28       .12       .26       .19
 10       .47       .50       .25       .43       .25
 11       .42       .50       .25       .42       .26
 12       .31       .46       .24       .40       .26
 13       .21       .41       .19       .36       .22
 14       .41       .49       .22       .44       .20
 15       .27       .45       .23       .39       .25
Total
Test     5.63      3.47      2.40      2.51       .48
Table 10b. Descriptive statistics of BTES fractions subtest - occasion C.

                       Standard Deviation
Item     Mean      Total    Between   Within      η²
  1       .50       .37       .21       .31       .32
  2       .83       .38       .21       .32       .32
  3       .60       .49       .22       .44       .20
  4       .75       .44       .19       .39       .19
  5       .39       .49       --        .41       .31
  6       .73       .45       .22       .39       .24
  7       .67       .47       .26       .39       .31
  8       .28       .45       .31       .32       .48
  9       .26       .44       .26       .35       .36
 10       .60       .49       .31       .39       .39
 11       .62       .49       .27       --        .29
 12       .54       .50       .30       .40       .36
 13       .36       .48       .25       .42       .27
 14       .59       .49       .33       .37       .45
 15       .36       .48       .27       .40       .32
Total
Test     8.08      3.63      2.41      2.72       .44

Table 10c. Descriptive statistics of BTES fractions subtest - occasion D.

                       Standard Deviation
Item     Mean      Total    Between   Within      η²
  1       .74       .44       .22       .39       .24
  2       .77       .42       .20       .37       --
  3       .58       .50       .16       .47       .10
  4       --        .45       .27       .36       .3?
  5       .24       .43       .25       .35       --
  6       .61       --        .26       .42       --
  7       .53       .50       .27       .42       --
  8       .33       .4?       .27       .39       .32
  9       .31       .47       .23       .50       .23
 10       .60       .49       .26       .42       .27
 11       .54       .50       .21       .46       .17
 12       .61       .49       .22       .44       .?0
 13       .36       .48       .23       .43       .21
 14       .63       .49       .23       .43       .23
 15       .49       .50       .28       .42       .30
Total
Test     8.06      3.74      2.40      2.87       .41
Table 11. BTES fraction subtest - Corrected part-whole correlations.

               Total                  Between                 Within
Item      B       C       D       B       C       D       B       C       D
  1     .552    .486    .485    .601    .549    .737    .444    .439    .354
  2     .563    .343    .419    .725    .400    .701    .429    .320    .293
  3     .169    .194    .267    .394    .078    .299   -.014    .257    .219
  4     .415    .388     --     .579    .313    .396    .287    .371    .203
  5     .035    .260    .548   -.148    .589    .807   -.027    .121    .366
  6     .523    .419    .353    .616    .492    .551    .404    .414    .269
  7     .660    .410     --      --      --      --      --      --      --
  8     .549    .624    .632    .667    .769    .810    .449    .519    .577
  9     .337    .595     --      --      --      --      --      --      --
 10     .457    .372    .544    .476    .476    .630    .372    .319    .466
 11     .511    .521    .382    .288    .630    .191    .509    .438    .274
 12     .375    .443    .320    .595    .622    .351    .221    .368    .256
 13     .195    .37?    .380    .389    .666    .475    .139    .242    .318
 14     .303    .385    .320    .571    .428    .094    .137    .304    .342
 15     .337    .536    .345    .491    .583    .359    .210    .395    .249

Three further values are legible for item 7 (.559, .420, .540) and for
item 9 (.281, .454, .457), but their columns cannot be determined.

Table 12. Regression of BTES fraction items occasion C on fraction items
occasion B (PRE) and allocated time (A.T.).

                           UNSTANDARDIZED

               Between              Within               Total
Item      PRE       A.T.       PRE       A.T.       PRE       A.T.
  1       .012      .118       .048      .100       .125*     .115
  2       .107      .030      -.128      .162       .113*     .114
  3      -.092      .077      -.151      .392*      .001      .363*
  4       .047      .108       .412*     .098       .134*     .197*
  5      -.004      .222       .319      .343*      .236*     .420*
  6      -.050      .140       .296      .005       .094      .094
  7      -.002      .063       .124      .177       .055      .255*
  8       .16?      .051       .55?      .284*      .170*     .374*
  9       .296*    -.012       .243      .227       .198*     .251
 10       .030      .174       .482*     .124       .181*     .246*
 11      -.172      .200       .441      .305*      .133      .387*
 12       .041      .162       .322      .257       .197*     .353*
 13       .079      .162       .570      .168       .212*     .287*
 14       .318     -.071*      .549*    -.127       .181*     .012
 15       .288     -.076      -.007      .342*      .134*     .385*

                            STANDARDIZED

               Between              Within               Total
Item      PRE       A.T.       PRE       A.T.       PRE       A.T.
  1       .02       .23        .04       .13        .24       .15
  2       .18       .06       -.10       .21        .21       .15
  3      -.12       .11       -.07       .39        .00       .36
  4       .07       .19        .25       .11        .24       .23
  5      -.01       .34        .12       .25        .36       .31
  6      -.07       .23        .17       .01        .15       .11
  7      -.00       .10        .08       .18        .08       .24
  8       .23       .08        .17       .21        .27       .2?
  9       .42      -.02        .07       .14        .33       .16
 10       .04       .27        .24       .13        .28       .25
 11      -.22       .29        .23       .31        .19       .40
 12       .05       .23        .15       .24        .28       .33
 13       .10       .24        .22       .15        .31       .25
 14       .41      -.11        .25      -.13        .28       .01
 15       .38      -.12       -.00       .32        .21       .36

* Coefficient exceeds twice its standard error.
Table 13. Regression of BTES fraction items occasion C on fraction items
occasion B (PRE) and easy time (E.T.).

                           UNSTANDARDIZED

               Between              Within               Total
Item      PRE       E.T.       PRE       E.T.       PRE       E.T.
  1      -.108*     .097      -.008      .105       .022      .087
  2      -.171*    -.114*      --        .168       --        .102
  3      -.051     -.021      -.188      .395*     -.053      .371*
  4       .010      .026       .351      .108       .037      .192*
  5      -.036      .054       .561      .349*      .038      .496*
  6      -.033      .053       .247      .266       .032      .090
  7      -.018      .014       .093      .188       .002      .221*
  8      -.030      .032       .500      .279       .009      .358*
  9      -.034      .048       .169      .230       .020      .261
 10       .058     -.019       .445      .127       .025      .244*
 11      -.103      .090       .474*     .293*      .005      .442*
 12      -.126      .099       .431      .181       .001      .339*
 13      -.085      .090       .574      .138       .014      .293
 14      -.012      .028       .631*    -.114       .019      .013
 15      -.123      .117       .268      .330*      .021      .416*

                            STANDARDIZED

               Between              Within               Total
Item      PRE       E.T.       PRE       E.T.       PRE       E.T.
  1       --        .19        --        .11        --        .11
  2       --        --         --        --         --        .14
  3       --        --        -.08       .?7       -.17       .37
  4       .03       .10        .21       .13        .14       .22
  5      -.10       .18        .21       .26        .12       .36
  6      -.10       .19        .15       .03        .11       .10
  7      -.05       .05        .06       .20        .01       .?3
  8      -.08       .09        .15       .20        .03       .26
  9      -.10       .14        .05       .14        .06       .16
 10       .16      -.05        .23       .13        .07       .2?
 11      -.30       .30        .25       --         .02       .45
 12      -.37       .33        .21       .17        .00       .32
 13      -.26       --         .22       .12        .05       .26
 14      -.03       .10        .29      -.11        .06       .01
 15      -.37       .40        .13       .31        .07       .39

* Coefficient exceeds twice its standard error.
Table 14. Regression of BTES fraction items occasion C on fraction items
occasion B (PRE) and hard time (H.T.).

                           UNSTANDARDIZED

               Between              Within               Total
Item      PRE       H.T.       PRE       H.T.       PRE       H.T.
  1       .017     -.030       .023      .082      -.020      .095
  2       .022     -.047*     -.116      .139      -.036*     .103
  3      -.032     -.016      -.261      .376*     -.032*     .332*
  4       .016     -.005       .373      .111       .006      .207*
  5       .042     -.029       .545      .358*     -.001      .503*
  6       .003     -.009       .270      .019      -.005      .096
  7      -.052      .021       .116      .210      -.008      .226
  8      -.008     -.008       .480      .274      -.013      .348*
  9       .004     -.020       .167      .249      -.017      .282
 10      -.013      .022       .440      .134       .010      .245*
 11       .056*    -.054*      .487*     .290*     -.021      .430*
 12       .021     -.029       .395      .193      -.014      .327
 13       .031     -.017       .577      .161       .001      .294*
 14       .014     -.006       .627*    -.119       .005      .010
 15       .043     -.040       .285      .328*     -.015      .416*

                            STANDARDIZED

               Between              Within               Total
Item      PRE       H.T.       PRE       H.T.       PRE       H.T.
  1       .11      -.25        .02       .11       -.17       .12
  2       .12      -.37       -.09       .18       -.28       .13
  3      -.14      -.10       -.12       .37       -.20       .33
  4       .09      -.03        .22       .13        .04       .24
  5       .20      -.19        .21       .26       -.01       .37
  6       .02      -.06        .16       .02       -.04       .11
  7      -.26       .14        .07       .22       -.05       .24
  8      -.04      -.05        .15       .20       -.09       .26
  9       .02      -.15        .05       .16       -.13       .18
 10      -.06       .15        .22       .14        .07       .25
 11       .29      -.37        .26       .30       -.14       .44
 12       .11      -.20        .19       .18       -.10       .31
 13       .17      -.12        .22       .14        .00       .26
 14       .07      -.04        .29      -.12        .03       .01
 15       .23      -.28        .13       .31       -.11       .39

* Coefficient exceeds twice its standard error.
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Al 



Biology Test Items Jorm B 



•Target 
Popul. 



Coiucm Bcha- Average NJed. Popul. Effect. 
Area vior Facility Dijcrini. Di^rim. Distr. 



Easier in 



Harder in 



Population II-Tcsi 4 B 



1»* In M ctptfffMtfnl w^re in a jtr tn^ 

tvfiw^ <i«vu4f k)r th«^ |M tiiil 'fonnexl. in thf jar. 
Whic)i Ihe MIowifif f llic Wsl cipUntlit* •! 
iHU mult? 

0« Ojwa»«s«Uuj»iArr^tr«tioii 

m COf ^laJuct^ k)r ^o<M]r iit.Sf«U 

JL \t!km WMfiHt ibull tf •» Mimtl u rhool. Hi* 
iMcKcr MiJ ^^.f A.i| ItAtw tthat thr animtS w»t . 

tha vM sure tl.«t it anc tn«t ^rcjred tlhcr 
tniiiMU far iu fcr»l Which elu*. 7*u think. M 

The U(tt]tw«« much Un(tf than ttvMwi^ 
C Theft wu • •raj«ttinf ri^a aladf tW to^ al th« 
•kttll 

* n. Fmu afth«t^h were kxif and f«Utfd 

& TW jaws Mttki w«rh tidtwava is *>cn «• ap and 



T«« wa»!«l U l««ni which af thraa ljrf«» af 
tUf. Mn«}. af lM«»->»Mld h« but for gr6wiitg 
Wana Ha r*«ir>d ihric .^avrpou. ^t a diffartnt ryff 
•I Mil in lacH ^t. ard ^Unt«l th« aan* aunhcr a{ 
Wana at •hawn in th« dra«rtf>|r. He fiarad 

•*>l* Vf M*U '<Mi tK« wliwlaw till and gav« aaeh 



II 



II 

(IB.6) 



II 

(IA.8) 



Biol. 
(13) 



Under. 36.4 



i27 



Biol. 
(12) 



'Applic. 63.3 33 



Biol. 
(13) 



Under. 54.8 



29 



India 
Iran. 



Belgium (Fr) 

Finland 

luly 



FRC 
' Finland. 
Netherlands 



Scotland 
Thailand' 



Belgium (Fr) Belgium (Fl) 
Netherlands FRG 
USA . Hungary 
Japan 




III' 



Vky waa T«M*a ai^riaiMt SOT a |aW a«« far hU 

A. T>« plafiU in AM po< ga< wm% iimJishl th«i« Um 
jjlala ' bnha that y 
• B. T1k«aia««iilaftotliA(4«h^( wat naClhtMiiM 
C 0«M^*hauIdh«vfl<cii placed in the dark 
O. Taa ahmdd Jia«a y«cd diflf rent aaiaunli af walar 
X» lWfl»t«^v««Ufttl««h«CMitAa«iad«ir»tQ 



5. tiM drawing rtff cMfii* a ^nC celt, (n which ;af tha II 
fMr rtgiM* Marked nufht ehlara|»Iaal« h« faund? 




A. 

C l«¥aikKaidL 
C YW «Mvtr fM flM4Mf ndMaia it ftntr^r •Mm' 
A* tMae^^ijfMi 



Biol. 
(14) 



Inform. 17.1 



17 



Finland 



II Biol. 
(UB.I) (13) 



Inform, 62.1 



25 



/ 

Hungary 



X 



11 



J 



Chile Belgium JFl) 

Intiia Belgium (Fr) 

Iran , FRG 

New^eabnd 
ThiiiUind 



• I 



A2 



X 



Biology Test Items - Form B 



-Target Content B,cha. Average Med. PopuK Efrcci. 
Popul. Area vior Facility Oi^rim . Discrim. Distn Easier m 



Harder in 



Biol. 



Under. 48.7 



36 



Thailand 



Netherlands 




D. 0 ii«arWii dioxide an^'^ U •tygcn 

E. 0i«c«rWndj9siJ««n4<l^ i»c«fWA]rdf«U 



C NiUofen 

D. Yiumtiil . 

9' 



k« A»d«» •« hith itt»«iit««i» in S«ith A»n*Tici aW 



9. l'k«Ajid««a 

#ieir »»fc«k»«ni* «»« "m* "-f - — 

TVf* h»v« idnoK t*ic* »• Buay frd ft- 



II 



II 



Biol. Inform. 21.8 SI B, D, E C 
(H) ' . 



Hungary Japan 
Italy 



Biot. 
(17) 



Higher 20.8 19 



D.E 



India 
Iran 



Belgium (Fl) 
Sweden 



,^M«rir> f*»» W f f»dtlCt4 flWf* ^ickly 

tN Atr •( the Ajtd«» tht iiiK«kiuiiU kfc«tht mrt 
d«9fly in arder to iMrm lli«^lM«t aaount •( 
•iftM iJi theit luA|» 

U Uit Aikiet thtfi is !cm •«]r|M cntcnnK tht 
•! iht ir.h jkitanu thil «n incftii* i« th« 
MsWf af rrd c«ffuKlc« fntkin • Ufg^f pr*- 
Mrtkm •( tl»t« •«>! en !• ^ Wd 
.D» Uk«kiunts •(the And« iiftJ av>rt f««i e«r. 
ywici— t« tr«A»^rt •«]r(iM thrtMph tht W««d 
v«mI« W<M*f« th«rt 1* Im* •»]r|cii.iu tht «if ihtf 
Wtsikc 

Ei tW !•«•« #ir f f ti*Mr« in th« AiUm cauvs hM 
a» eirraUu Mft ^ickl)r thfM^ h th* Uo«d vf>> 
Mb and M Mft rad cac^KM* Mt iMtdcd t« 



Afl tff like r«U««» init •» ttiKrto tf th« rtf f«dtftti«« 
Mfcm. t klch •»« •( thtm wa^ nctuf b<rt>f« »• 
Mil ha MfMin that ftrtiUtaii^N hit l«kaa 
A. AiMiaat9aiii«Ai(ii«MA«diw«t« 

• C TW uttcbtt* •( • fn«l« famtla miHt fiiM with that 

•I a ftMiaW f amtu 
Ou AiMfiMi«t<MMiiiMNtmcha«rfCc«U 
C» A iMMle «afii«t« mmM hava a tl«ta •! UaJ lar 

in— hr|» 



11 



Biol. 



Inform. 33.D 18 D D 



(10.V.7) (Itt) 



Belgium (Fl) Japan 
Iran 




APPENDIX B 


